skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Kay, Matthew"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Free, publicly-accessible full text available April 25, 2026
  2. Free, publicly-accessible full text available April 25, 2026
  3. Free, publicly-accessible full text available January 1, 2026
  4. Multiverse analyses involve conducting all combinations of reasonable choices in a data analysis process. A reader of a study containing a multiverse analysis might question—are all the choices included in the multiverse reasonable and equally justifiable? How much do results vary if we make different choices in the analysis process? In this work, we identify principles for validating the composition of, and interpreting the uncertainty in, the results of a multiverse analysis. We present Milliways, a novel interactive visualisation system to support principled evaluation of multiverse analyses. Milliways provides interlinked panels presenting result distributions, individual analysis composition, multiverse code specification, and data summaries. Milliways supports interactions to sort, filter and aggregate results based on the analysis specification to identify decisions in the analysis process to which the results are sensitive. To represent the two qualitatively different types of uncertainty that arise in multiverse analyses—probabilistic uncertainty from estimating unknown quantities of interest such as regression coefficients, and possibilistic uncertainty from choices in the data analysis—Milliways uses consonance curves and probability boxes. Through an evaluative study with five users familiar with multiverse analysis, we demonstrate how Milliways can support multiverse analysis tasks, including a principled assessment of the results of a multiverse analysis. 
    more » « less
  5. Visualization literacy is an essential skill for accurately interpreting data to inform critical decisions. Consequently, it is vital to understand the evolution of this ability and devise targeted interventions to enhance it, requiring concise and repeatable assessments of visualization literacy for individuals. However, current assessments, such as the Visualization Literacy Assessment Test ( vlat ), are time-consuming due to their fixed, lengthy format. To address this limitation, we develop two streamlined computerized adaptive tests ( cats ) for visualization literacy, a-vlat and a-calvi , which measure the same set of skills as their original versions in half the number of questions. Specifically, we (1) employ item response theory (IRT) and non-psychometric constraints to construct adaptive versions of the assessments, (2) finalize the configurations of adaptation through simulation, (3) refine the composition of test items of a-calvi via a qualitative study, and (4) demonstrate the test-retest reliability (ICC: 0.98 and 0.98) and convergent validity (correlation: 0.81 and 0.66) of both CATS via four online studies. We discuss practical recommendations for using our CATS and opportunities for further customization to leverage the full potential of adaptive assessments. All supplemental materials are available at https://osf.io/a6258/ . 
    more » « less
  6. Visualization grammars, often based on the Grammar of Graphics (GoG), have much potential for augmenting data analysis in a programming environment. However, we do not know how analysts conceptualize grammar abstractions, or how a visualization grammar works with data analysis in practice. Therefore, we qualitatively analyzed how experienced analysts (N = 6) from TidyTuesday, a social data project, wrangled and visualized data using GoG-based ggplot2 without given tasks in R Markdown. Though participants’ analysis and customization needs could mismatch with GoG component design, their analysis processes aligned with the goal of GoG to expedite visualization iteration. We also found a feedback loop and tight coupling between visualization and data transformation code, explaining both participants’ productivity and their errors. From these results, we discuss how future visualization grammars can become more practical for analysts and how visualization grammar and analysis tools can better integrate within a programming (i.e., computational notebook) environment. 
    more » « less
  7. We propose a new approach to uncertainty communication: we keep the uncertainty representation fixed, but adjust the distribution displayed to compensate for biases in people’s subjective probability in decision-making. To do so, we adopt a linear-in-probit model of subjective probability and derive two corrections to a Normal distribution based on the model’s intercept and slope: one correcting all right-tailed probabilities, and the other preserving the mode and one focal probability. We then conduct two experiments on U.S. demographically-representative samples. We show participants hypothetical U.S. Senate election forecasts as text or a histogram and elicit their subjective probabilities using a betting task. The first experiment estimates the linear-in-probit intercepts and slopes, and confirms the biases in participants’ subjective probabilities. The second, preregistered follow-up shows participants the bias-corrected forecast distributions. We find the corrections substantially improve participants’ decision quality by reducing the integrated absolute error of their subjective probabilities compared to the true probabilities. These corrections can be generalized to any univariate probability or confidence distribution, giving them broad applicability. Our preprint, code, data, and preregistration are available at https://doi.org/10.17605/osf.io/kcwxm 
    more » « less
  8. Visualization misinformation is a prevalent problem, and combating it requires understanding people’s ability to read, interpret, and reason about erroneous or potentially misleading visualizations, which lacks a reliable measurement: existing visualization literacy tests focus on well-formed visualizations. We systematically develop an assessment for this ability by: (1) developing a precise definition of misleaders (decisions made in the construction of visualizations that can lead to conclusions not supported by the data), (2) constructing initial test items using a design space of misleaders and chart types, (3) trying out the provisional test on 497 participants, and (4) analyzing the test tryout results and refining the items using Item Response Theory, qualitative analysis, a wrong-due-to-misleader score, and the content validity index. Our final bank of 45 items shows high reliability, and we provide item bank usage recommendations for future tests and different use cases. Related materials are available at: https://osf.io/pv67z/. 
    more » « less
  9. The grammar of graphics is ubiquitous, providing the foundation for a variety of popular visualization tools and toolkits. Yet support for uncertainty visualization in the grammar graphics—beyond simple variations of error bars, uncertainty bands, and density plots—remains rudimentary. Research in uncertainty visualization has developed a rich variety of improved uncertainty visualizations, most of which are difficult to create in existing grammar of graphics implementations. ggdist , an extension to the popular ggplot2 grammar of graphics toolkit, is an attempt to rectify this situation. ggdist unifies a variety of uncertainty visualization types through the lens of distributional visualization, allowing functions of distributions to be mapped to directly to visual channels (aesthetics), making it straightforward to express a variety of (sometimes weird!) uncertainty visualization types. This distributional lens also offers a way to unify Bayesian and frequentist uncertainty visualization by formalizing the latter with the help of confidence distributions. In this paper, I offer a description of this uncertainty visualization paradigm and lessons learned from its development and adoption: ggdist has existed in some form for about six years (originally as part of the tidybayes R package for post-processing Bayesian models), and it has evolved substantially over that time, with several rewrites and API re-organizations as it changed in response to user feedback and expanded to cover increasing varieties of uncertainty visualization types. Ultimately, given the huge expressive power of the grammar of graphics and the popularity of tools built on it, I hope a catalog of my experience with ggdist will provide a catalyst for further improvements to formalizations and implementations of uncertainty visualization in grammar of graphics ecosystems. A free copy of this paper is available at https://osf.io/2gsz6 . All supplemental materials are available at https://github.com/mjskay/ggdist-paper and are archived on Zenodo at doi:10.5281/zenodo.7770984 . 
    more » « less